145 research outputs found

Parallel classification and feature selection in microarray data using SPRINT

Author: Akl SG
Breiman L
Ihaka R
Kotsiantis SB
Liaw A
Shafer JC
Smith CL
Topiẃc G
Publication venue: 'Wiley'
Publication date: 01/03/2014

The statistical language R is favoured by many biostatisticians for processing microarray data. In recent times, the quantity of data that can be obtained in experiments has risen significantly, making previously fast analyses time-consuming or even impossible with the existing software infrastructure. High performance computing (HPC) systems offer a solution to these problems, but at the expense of increased complexity for the end user. The Simple Parallel R Interface (SPRINT) is a library for R that aims to reduce the complexity of using HPC systems by providing biostatisticians with drop-in parallelised replacements of existing R functions. In this paper we describe parallel implementations of two popular techniques: exploratory clustering analyses using the random forest classifier, and feature selection through identification of differentially expressed genes using the rank product method.
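To make the rank product idea mentioned in the abstract concrete, here is a minimal serial sketch in Python. It is not the SPRINT implementation and not R code. The function name `rank_products` and the toy fold-change values are assumptions for illustration, ties are broken by input order, and the permutation-based significance step is omitted.

```python
import math


def rank_products(fold_changes):
    """Return the geometric mean rank of each gene across replicates.

    fold_changes[g][r] is the fold change of gene g in replicate r.
    Rank 1 is assigned to the largest fold change within a replicate.
    """
    n_genes = len(fold_changes)
    n_reps = len(fold_changes[0])
    log_sums = [0.0] * n_genes
    for r in range(n_reps):
        # Order gene indices by decreasing fold change in this replicate.
        order = sorted(range(n_genes),
                       key=lambda g: fold_changes[g][r],
                       reverse=True)
        for rank, g in enumerate(order, start=1):
            log_sums[g] += math.log(rank)
    # Geometric mean of ranks = exp(mean of log ranks).
    return [math.exp(s / n_reps) for s in log_sums]


# Hypothetical toy data: 3 genes, 2 replicates.
toy = [[3.0, 2.5], [1.0, 0.9], [2.0, 3.0]]
print(rank_products(toy))
```

A small rank product means the gene is consistently near the top of every replicate. Each replicate is ranked independently, which is one reason the method lends itself to parallelisation.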

Durham Research Online

Crossref

PubMed Central

Edinburgh Research Explorer

Shallow vs deep learning architectures for white matter lesion segmentation in the early stages of multiple sclerosis

Author: A Carass
D Garcia-Lorenzo
E Geremia
M Styner
MJ Fartaria
NJ Tustison
R Kikinis
S Klein
S Valverde
SB Kotsiantis
SM Smith
T Brosch
Publication date: 01/01/2018

In this work, we present a comparison of a shallow and a deep learning architecture for the automated segmentation of white matter lesions in MR images of multiple sclerosis patients. In particular, we train and test both methods on early stage disease patients, to verify their performance in challenging conditions, more similar to a clinical setting than what is typically provided in multiple sclerosis segmentation challenges. Furthermore, we evaluate a prototype naive combination of the two methods, which refines the final segmentation. All methods were trained on 32 patients, and the evaluation was performed on a pure test set of 73 cases. Results show low lesion-wise false positives (30%) for the deep learning architecture, whereas the shallow architecture yields the best Dice coefficient (63%) and volume difference (19%). Combining both shallow and deep architectures further improves the lesion-wise metrics (69% and 26% lesion-wise true and false positive rate, respectively). Comment: Accepted to the MICCAI 2018 Brain Lesion (BrainLes) workshop
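For readers unfamiliar with the Dice coefficient reported in this abstract, a minimal sketch of the standard definition follows: twice the overlap divided by the sum of the two mask sizes. The function name and the flattened toy masks are assumptions for illustration and do not come from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat sequences."""
    pred_set = {i for i, v in enumerate(pred) if v}
    truth_set = {i for i, v in enumerate(truth) if v}
    if not pred_set and not truth_set:
        # Convention assumed here: two empty masks agree perfectly.
        return 1.0
    overlap = len(pred_set & truth_set)
    return 2 * overlap / (len(pred_set) + len(truth_set))


# Hypothetical flattened masks: 1 marks a lesion voxel.
predicted = [1, 1, 0, 0, 1]
reference = [1, 0, 0, 1, 1]
print(dice_coefficient(predicted, reference))
```

A value of 1 means the masks coincide and 0 means they do not overlap at all.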

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Crossref

Serveur académique lausannois

Aiding first incident responders using a decision support system based on live drone feeds

Author: A Banerjee
Anna C Schapiro
CJ Burges
CM Bishop
GS Dotson
J Fürnkranz
N Cristianini
OT Yildiz
R Chen
SB Kotsiantis
SK Murthy
TS Lim
Publication venue: Springer Singapore
Publication date: 01/01/2018

When a dangerous incident occurs, such as a fire, a collision or an earthquake, a lot of contextual data is available to the first incident responders handling it. Based on this data, a commander on scene or dispatchers need to make split-second decisions to get a good overview of the situation and to avoid further injuries or risks. Therefore, we propose a decision support system that can aid incident responders on scene in prioritizing the rescue efforts that need to be addressed. The system collects relevant data from a custom-designed drone by detecting objects such as firefighters, fires, victims, fuel tanks, etc. The drone autonomously observes the incident area and, based on the detected information, proposes an action list prioritized by, e.g., urgency or danger to incident responders.

Crossref

Ghent University Academic Bibliography

Missing data is an issue in many real-world datasets, yet robust methods for dealing with missing data appropriately still need development. In this paper we investigate how some methods for handling missing data perform as the uncertainty increases. Using benchmark datasets from the UCI Machine Learning repository, we generate datasets for our experiments with increasing amounts of data Missing Completely At Random (MCAR), both at the attribute level and at the record level. We then apply four classification algorithms: C4.5, Random Forest, Naïve Bayes and Support Vector Machines (SVMs). We measure the performance of each classifier on the basis of complete case analysis and simple imputation, and then study the performance of the algorithms that can handle missing data themselves. We find that complete case analysis has a detrimental effect because it renders many datasets infeasible when missing data increases, particularly for high-dimensional data. We find that increasing missing data does have a negative effect on the performance of all the algorithms tested, but the algorithms tested, whether using preprocessing in the form of simple imputation or handling the missing data directly, do not show a significant difference in performance.
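The experimental setup described in this abstract (injecting MCAR missingness, then comparing complete case analysis with simple imputation) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the helper names, the use of `None` as the missing marker, mean imputation as the "simple imputation", and the toy rows are all hypothetical.

```python
import random


def make_mcar(rows, fraction, seed=0):
    """Blank each cell independently with probability `fraction` (MCAR)."""
    rng = random.Random(seed)
    return [[None if rng.random() < fraction else v for v in row]
            for row in rows]


def complete_cases(rows):
    """Complete case analysis: keep only rows with no missing value."""
    return [row for row in rows if None not in row]


def mean_impute(rows):
    """Replace each missing cell with the mean of its column's observed values."""
    n_cols = len(rows[0])
    means = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        # A column with no observed values stays missing.
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[c] if row[c] is None else row[c] for c in range(n_cols)]
            for row in rows]


# Hypothetical numeric dataset: 4 records, 3 attributes.
data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.5, 2.5, 3.5]]
damaged = make_mcar(data, fraction=0.3, seed=42)
print(len(complete_cases(damaged)), "complete rows remain")
print(mean_impute(damaged))
```

With many attributes, the chance that a record has no missing cell shrinks quickly as the missing fraction grows, which matches the abstract's remark that complete case analysis becomes infeasible for high-dimensional data.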

Crossref

University of East Anglia digital repository

Car make and model recognition under limited lighting conditions at night

Author: A Psyllos
B Zhang
D Zhang
DR Amancio
G Chandrashekar
G Dougherty
H Emami
IS Oh
J Bernardo
M Kafai
N Boonsim
Noppakun Boonsim
R Baran
R O’Malley
S Du
S Lee
SB Kotsiantis
Simant Prakoonwit
SS Khan
TD Raty
TM Cover
V Vapnik
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Car make and model recognition (CMMR) has become an important part of intelligent transport systems. Information provided by CMMR can be utilized when license plate numbers cannot be identified or fake number plates are used. CMMR can also be used when a certain model of a vehicle is required to be automatically identified by cameras. The majority of existing CMMR methods are designed to be used only in daytime, when most of the car features can be easily seen. Few methods have been developed to cope with limited lighting conditions at night, where many vehicle features cannot be detected. The aim of this work was to identify car make and model at night by using available rear view features. This paper presents a one-class classifier ensemble designed to distinguish a particular car model of interest from other models. The combination of salient geographical and shape features of taillights and license plates from the rear view is extracted and used in the recognition process. The majority vote from support vector machine, decision tree, and k-nearest neighbors is applied to verify a target model in the classification process. The experiments on 421 car makes and models captured under limited lighting conditions at night show a classification accuracy rate of about 93%.
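The majority vote over three one-class classifiers described in this abstract can be sketched as follows. The three boolean inputs stand in for the accept/reject decisions of the support vector machine, decision tree and k-nearest neighbors models. The function name and the example votes are assumptions for illustration, and no trained classifiers are involved.

```python
def accept_target(svm_vote, tree_vote, knn_vote):
    """Accept the target car model when at least two of three classifiers do."""
    votes = [svm_vote, tree_vote, knn_vote]
    # Each True counts as one vote for "this is the target model".
    return sum(bool(v) for v in votes) >= 2


# Hypothetical decisions for one rear-view image.
print(accept_target(True, False, True))
print(accept_target(False, False, True))
```

With an odd number of binary voters there can be no tie, so no tie-breaking rule is needed.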

Crossref

Springer - Publisher Connector

Bournemouth University Research Online

Rapid Diagnostic Algorithms as a Screening Tool for Tuberculosis: An Assessor Blinded Cross-Sectional Study

Author: A Ali-Gombe
A Fares
A Sita-Lumsden
A Ustianowski
Alexandra Indra
B Thiede
Bernhard Parschalk
BJ Marais
C Lange
CC Boehme
CJ Clopper
CK Liam
D Agranoff
Delmiro Fernandez-Reyes
E Harju
Franz Ratzinger
G Walzl
H Getahun
Harald Bruckschwaiger
Heimo Lagler
J Nemeth
J Zhang
K Fassbender
KP Cain
M Glennon
M Sester
Martin Wischenbart
MB Miller
MD Perkins
Michael Ramharter
Olivier Neyrolles
OP Sharma
P Papay
R McNerney
S Le Cessie
Sanjeev Krishna
SB Kotsiantis
Stefan Winkler
SV Balasingham
T Fawcett
T Tanaka
Wolfgang Graninger
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

Background: A major obstacle to effectively treat and control tuberculosis is the absence of an accurate, rapid, and low-cost diagnostic tool. A new approach for the screening of patients for tuberculosis is the use of rapid diagnostic classification algorithms. Methods: We tested a previously published diagnostic algorithm based on four biomarkers as a screening tool for tuberculosis in a Central European patient population using an assessor-blinded cross-sectional study design. In addition, we developed an improved diagnostic classification algorithm based on a study population at a tertiary hospital in Vienna, Austria, by supervised computational statistics. Results: The diagnostic accuracy of the previously published diagnostic algorithm for our patient population consisting of 206 patients was 54% (CI: 47%–61%). An improved model was constructed using inflammation parameters and clinical information. A diagnostic accuracy of 86% (CI: 80%–90%) was demonstrated by 10-fold cross validation. An alternative model relying solely on clinical parameters exhibited a diagnostic accuracy of 85% (CI: 79%–89%). Conclusion: Here we show that a rapid diagnostic algorithm based on clinical parameters is only slightly improved by inclusion of inflammation markers in our cohort. Our results also emphasize the need for validation of new diagnostic algorithms in different settings and patient populations

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

Publikationsserver der Universität Tübingen

PubMed Central

St George's Online Research Archive

Possibilistic classifiers for numerical data

Author: A Pérez
B Haouari
D Dubois
E Hüllermeier
Henri Prade
I Jenhani
J Beringer
J Demsar
JR Quinlan
Khaled Mellouli
LA Zadeh
Mathieu Serrurier
Myriam Bounhas
N Ben Amor
N Friedman
P Domingos
R Solomonoff
SB Kotsiantis
TM Cover
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Naive Bayesian Classifiers, which rely on independence hypotheses together with a normality assumption to estimate densities for numerical data, are known for their simplicity and their effectiveness. However, estimating densities, even under the normality assumption, may be problematic when data are scarce. In such a situation, possibility distributions may provide a more faithful representation of these data. Naive Possibilistic Classifiers (NPC), based on possibility theory, have recently been proposed as a counterpart of Bayesian classifiers to deal with classification tasks. Only a few works treat possibilistic classification, and most existing NPCs deal only with categorical attributes. This work focuses on the estimation of possibility distributions for continuous data. In this paper we investigate two kinds of possibilistic classifiers. The first is derived from classical or flexible Bayesian classifiers by applying a probability-possibility transformation to Gaussian distributions, which introduces some further tolerance in the description of classes. The second is based on a direct interpretation of data in possibilistic formats that exploit an idea of proximity between data values in different ways, which provides a less constrained representation of them. We show that possibilistic classifiers are better able than Bayesian classifiers to detect new instances for which the classification is ambiguous, where probabilities may be poorly estimated and illusorily precise. Moreover, we propose, in this case, a hybrid possibilistic classification approach based on a nearest-neighbour heuristic to improve the accuracy of the proposed possibilistic classifiers when the available information is insufficient to choose between classes. Possibilistic classifiers are compared with classical or flexible Bayesian classifiers on a collection of benchmark databases. The experiments reported show the interest of possibilistic classifiers. In particular, flexible possibilistic classifiers perform well for data agreeing with the normality assumption, while proximity-based possibilistic classifiers outperform others in the other cases. The hybrid possibilistic classification exhibits a good ability for improving accuracy.

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

OPUS - University of Technology Sydney

Open Archive Toulouse Archive Ouverte

Combining machine learning and metaheuristics algorithms for classification method PROAFTN

Author: A Ban
A Ishizaka
AT Brasil Filho
B Roy
C Zopounidis
D Dubois
D Monekosso
D Rav
E-G Talbi
EG Talbi
ESM El-Alfy
F Al-Obeidat
F Bergh van den
F Sobrado
FW Glover
H Witten
I Sassi
J Ching
J. R. Quinlan
K Crammer
M Doumpos
M Goebel
N Belacel
N Singh
P Hansen
P Perny
P Vincke
S Garcia
S García
S Kotsiantis
SB Kotsiantis
Suneth Ranasinghe
T Marchant
X Wu
Publication venue: ZU Scholars
Publication date: 01/01/2019

Supervised learning classification algorithms are among the best-known successful techniques for ambient assisted living environments. However, the usual supervised learning classification approaches face issues that limit their application, especially in dealing with knowledge interpretation and with very large, unbalanced labeled data sets. To address these issues, the fuzzy classification method PROAFTN was proposed. PROAFTN belongs to the family of learning algorithms and makes it possible to determine fuzzy resemblance measures by generalizing the concordance and discordance indexes used in outranking methods. The main goal of this chapter is to show how combining metaheuristics with inductive learning techniques can improve the performance of the PROAFTN classifier. The improved PROAFTN classifier is described and compared to well-known classifiers in terms of their learning methodology and classification accuracy. Throughout this chapter, we show the ability of metaheuristics, when embedded in the PROAFTN method, to solve classification problems efficiently.

ZU Scholars (Zayed University)

NRC Publications Archive

Crossref